The files in this folder can be used to replicate all tables and figures in
"Who Punishes Extremist Nominees" by Andrew B. Hall and Daniel M. Thompson.
All analyses were conducted with Stata 14.0. Reproducing tables 1 through 6
and A2 through A8 and figures 2, 3, and A1 requires the "rdrobust" package.
This can be installed by entering 
	net install rdrobust, from(https://sites.google.com/site/rdpackages/rdrobust/stata) replace
into the Stata console. Figure A2 requires users to download the "DCdensity"
package. Its creator, Justin McCrary, describes how to install it on his website:
https://eml.berkeley.edu/~jmccrary/DCdensity/. The figures created in R
require the tidyverse and haven packages, which can be installed using
	install.packages("tidyverse")
	install.packages("haven")
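
Before installing, it may help to check which of the two R packages named
above are already present. The following is a minimal sketch, not part of
the replication code; the variable names "required" and "missing_pkgs" are
illustrative assumptions.

```r
# Packages this README names as required for the R figures.
required <- c("tidyverse", "haven")

# Compare against the packages already installed in this R library.
missing_pkgs <- setdiff(required, rownames(installed.packages()))

# Report what still needs to be installed; nothing is downloaded here.
if (length(missing_pkgs) > 0) {
  message("Still to install: ", paste(missing_pkgs, collapse = ", "))
} else {
  message("All required R packages are installed.")
}
```

Any package reported as missing can then be installed with the
install.packages() calls shown above.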


Note: The authors updated this replication file on Sept 24, 2024. An earlier
version of the replication data for this paper relied on a district variable in 
the CCES that assigned some respondents to the wrong congressional district in 
2012. This new version of the replication file corrects this error and reproduces
the results reported in the corrected version of the article. To reproduce all
tables and figures in the original published version of the article, use the 
versions of the datasets ending in "original" and available in the directory 
named "hall_thompson_original_files".
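
To see which original-version datasets are available, the directory can be
listed from R. This is a minimal sketch that assumes the working directory
is the replication folder and that the relevant filenames contain the word
"original"; the exact filenames are not stated here, so the pattern is an
assumption.

```r
# Directory named in the note above.
original_dir <- "hall_thompson_original_files"

# List files whose names contain "original" (assumed naming pattern).
# list.files() returns an empty character vector if nothing matches.
original_files <- list.files(original_dir, pattern = "original",
                             full.names = TRUE)

if (length(original_files) == 0) {
  message("No matching files found in ", original_dir)
} else {
  print(original_files)
}
```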


Table of Contents:

"who punishes extremist nominees replication.do" contains all of the code
necessary to replicate the tables and Figure A2. Most of the figures are
built in R, but some data munging and analysis is first done in Stata. Run
this script all the way through before running the code for producing the
figures in R.

"who punishes extremist nominees replication - figures.R" contains all 
of the code necessary to replicate the figures built in R (all figures
other than Figure A2). "who punishes extremist nominees replication.do"
must be run to completion first for this file to work.

"rd_analysis_hs.dta" is the main analysis dataset. It contains data on
general election results and turnout by district for districts with a
competitive primary. The distance between two primary candidates in the
same party is determined by the Hall-Snyder money-based scaling.
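
For a quick look at the main dataset from R, something like the following
may be useful. It is a sketch only: the helper "summarize_dta" is
hypothetical and not part of the replication code, and it assumes the
working directory is the replication folder. It uses haven::read_dta(),
the haven function for reading Stata .dta files.

```r
# Hypothetical helper: report the size and variable names of a .dta file.
summarize_dta <- function(path) {
  if (!file.exists(path)) {
    message("File not found: ", path)
    return(invisible(NULL))
  }
  if (!requireNamespace("haven", quietly = TRUE)) {
    message("The haven package is not installed.")
    return(invisible(NULL))
  }
  data <- haven::read_dta(path)
  list(n_rows = nrow(data), n_cols = ncol(data), variables = names(data))
}

# Inspect the main analysis dataset described above.
main_summary <- summarize_dta("rd_analysis_hs.dta")
```

The same call should work for the other "rd_analysis" datasets listed
below, since they are also Stata .dta files.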

"rd_analysis.dta" is the same as "rd_analysis_hs.dta" but it uses
Adam Bonica's CF Score to measure the ideological distance between
candidates in a primary.

"rd_analysis_dyn.dta" is the same as "rd_analysis_hs.dta" but it uses
Adam Bonica's dynamic CF Score to measure the ideological distance between
candidates in a primary.

"rd_analysis_dime.dta" is the same as "rd_analysis_hs.dta" but it uses
Adam Bonica's DWDIME prediction of DW-NOMINATE to measure the ideological
distance between candidates in a primary.

"rd_analysis_shor_new.dta" is the same as "rd_analysis_hs.dta" but it uses
Boris Shor and Nolan McCarty's NP score to measure the ideological
distance between candidates in a primary.

"rd_analysis_no_leaners_hs.dta" is the same as "rd_analysis_hs.dta" but
it measures turnout without voters who lean Democratic or Republican
rather than identifying with one party outright.

"rd_analysis_vote_choice_hs.dta" is the same as "rd_analysis_hs.dta" but
it measures vote choice using the CCES stated vote choice for the House
race rather than the observed election results.

"panel_analysis.dta" is a panel dataset of turnout rates by party from the
CCES for each congressional district over time together with the CF score
of the candidates from each party.

"competitiveness_analysis.dta" is a panel of the competitiveness of US House
elections by congressional district along with the total voter turnout
in that race.
